{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "# 第4章 朴素贝叶斯法"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "## 习题4.1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "&emsp;&emsp;用极大似然估计法推出朴素贝叶斯法中的概率估计公式(4.8)及公式 (4.9)。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**解答：**  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**解答思路：**\n",
    "1. 极大似然估计的一般步骤（详见习题1.1第3步）\n",
    "2. 证明公式4.8：根据输出空间$\\mathcal{Y}$的随机变量$Y$满足独立同分布，列出似然函数，求解概率$P(Y=c_k)$的值；\n",
    "3. 证明公式4.9：证明同公式4.8。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**解答步骤：**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**第1步：极大似然估计的一般步骤** "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> 参考Wiki：https://en.wikipedia.org/wiki/Maximum_likelihood_estimation   \n",
    "> 1. 写出随机变量的概率分布函数；  \n",
    "> 2. 写出似然函数；\n",
    "> 3. 对似然函数取对数，得到对数似然函数，并进行化简；\n",
    "> 4. 对参数进行求导，并令导数等于0；\n",
    "> 5. 求解似然函数方程，得到参数的值。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**第2步：证明公式(4.8)** \n",
    "$$\n",
    "\\displaystyle P(Y=c_k) = \\frac{\\displaystyle \\sum_{i=1}^N I(y_i=c_k)}{N}, \\quad k=1,2,\\ldots,K\n",
    "$$   "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "&emsp;&emsp;根据书中第4章的第4.1节朴素贝叶斯法的基本方法：\n",
    "> &emsp;&emsp;设输入空间$\\mathcal{X} \\subseteq R^n$为$n$维向量的集合，输出空间为类标记集合$\\mathcal{Y}=\\{c_1,c_2,\\ldots,c_k\\}$。输入为特征向量$x \\in \\mathcal{X}$，输出为类标记$y \\in \\mathcal{Y}$。$X$是定义在输入空间$\\mathcal{X}$上的随机向量，$Y$是定义在输出空间$\\mathcal{Y}$上的随机变量。$P(X,Y)$是$X$和$Y$的联合概率分布。训练数据集\n",
    "> $$ T=\\{(x_1,y_1),(x_2,y_2),\\ldots,(x_N,y_N)\\} $$\n",
    "> 由$P(X,Y)$独立同分布产生。  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "&emsp;&emsp;根据上述定义，$Y={y_1,y_2,\\ldots,y_N}$满足独立同分布，假设$P(Y=c_k)$概率为$p$，其中$c_k$在随机变量$Y$中出现的次数$\\displaystyle m=\\sum_{i=1}^NI(y_i=c_k)$，可得似然函数为：\n",
    "$$\n",
    "\\begin{aligned} L(p|Y) &= f(Y|p) \\\\\n",
    "&= C_N^m p^m (1-p)^{N-m}\n",
    "\\end{aligned}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "对似然函数取对数，得到对数似然函数为：\n",
    "$$\n",
    "\\begin{aligned} \\displaystyle \\log L(p|Y) &= \\log C_N^m p^m (1-p)^{N-m} \\\\\n",
    "&= \\log C_N^m + \\log(p^m) + \\log\\left( (1-p)^{N-m} \\right) \\\\\n",
    "&= \\log C_N^m + m\\log p + (N-m)\\log (1-p)\n",
    "\\end{aligned}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "求解参数$p$：\n",
    "$$\n",
    "\\begin{aligned}\n",
    "\\hat{p} &= \\mathop{\\arg\\max} \\limits_{p} L(p|Y) \\\\\n",
    "&= \\mathop{\\arg\\max} \\limits_{p} \\left[\\log C_N^m + m\\log p + (N-m)\\log (1-p) \\right]\n",
    "\\end{aligned}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "对参数$p$求导，并求解导数为0时的$p$值：\n",
    "$$\n",
    "\\begin{aligned}\n",
    "\\frac{\\partial \\log L(p)}{\\partial p} &= \\frac{m}{p} - \\frac{N-m}{1-p} \\\\\n",
    "&= \\frac{m(1-p) - p(N-m)}{p(1-p)} \\\\\n",
    "&= \\frac{m-Np}{p(1-p)} = 0\n",
    "\\end{aligned}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "从上式可得，$m-Np=0$，即$\\displaystyle P(Y=c_k)=p=\\frac{m}{N}$  \n",
    "综上所述，$\\displaystyle P(Y=c_k)=p=\\frac{m}{N}=\\frac{\\displaystyle \\sum_{i=1}^N I(y_i=c_k)}{N}$，公式(4.8)得证。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**第3步：证明公式(4.9)** \n",
    "$$\\displaystyle P(X^{(j)}=a_{jl}|Y=c_k) = \\frac{\\displaystyle \\sum_{i=1}^N I(x_i^{(j)}=a_{jl},y_i=c_k)}{\\displaystyle \\sum_{i=1}^N I(y_i=c_k)} \\\\\n",
    "j=1,2,\\ldots,n; \\quad l = 1,2,\\ldots,S_j; \\quad k = 1,2,\\dots,K\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "&emsp;&emsp;根据书中第4章朴素贝叶斯法的条件独立性假设：\n",
    "> &emsp;&emsp;朴素贝叶斯法对条件概率分布作了条件独立性的假设。由于这是一个较强的假设，朴素贝叶斯法也由此得名。具体地，条件独立性假设是：\n",
    "> $$ \\begin{aligned}\n",
    "P(X=x|Y=c_k) &= P(X^{(1)}=x^{(1)},\\ldots,X^{(n)}=x^{(n)}|Y=c_k) \\\\\n",
    "&= \\prod_{j=1}^n P(X^{(j)}=x^{(j)}|Y=c_k)\n",
    "\\end{aligned} \\tag{4.3} $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "&emsp;&emsp;根据上述定义，在条件$Y=c_k$下，随机变量$X$满足条件独立性，假设$P(X^{(j)}=a_{jl}|Y=c_k)$概率为$p$，其中$c_k$在随机变量$Y$中出现的次数$\\displaystyle m=\\sum_{i=1}^NI(y_i=c_k)$，$y=c_k$和$x^{(j)}=a_{jl}$同时出现的次数$\\displaystyle q=\\sum_{i=1}^N I(x_i^{(j)}=a_{jl},y_i=c_k)$，可得似然函数为：  \n",
    "$$\n",
    "\\begin{aligned} L(p|X,Y) &= f(X,Y|p) \\\\\n",
    "&= C_m^q p^q (1-p)^{m-q}\n",
    "\\end{aligned}\n",
    "$$\n",
    "与第2步推导过程类似，可求解得到$\\displaystyle p=\\frac{q}{m}$  \n",
    "综上所述，$\\displaystyle P(X^{(j)}=a_{jl}|Y=c_k)=p=\\frac{q}{m}=\\frac{\\displaystyle \\sum_{i=1}^N I(x_i^{(j)}=a_{jl},y_i=c_k)}{\\displaystyle \\sum_{i=1}^N I(y_i=c_k)}$，公式(4.9)得证。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 习题4.2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "&emsp;&emsp;用贝叶斯估计法推出朴素贝叶斯法中的慨率估计公式(4.10)及公式(4.11)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**解答：** "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**解答思路：**\n",
    "1. 贝叶斯估计的一般步骤（详见习题1.1第4步）；\n",
    "2. 证明公式4.11：假设概率$P_{\\lambda}(Y=c_i)$服从狄利克雷（Dirichlet）分布，根据贝叶斯公式，推导后验概率也服从Dirichlet分布，求参数期望；\n",
    "3. 证明公式4.10：证明同公式4.11。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**解答步骤：**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**第1步：贝叶斯估计的一般步骤**  \n",
    "参考Wiki：https://en.wikipedia.org/wiki/Bayes_estimator\n",
    "> 1. 确定参数$\\theta$的先验概率$p(\\theta)$\n",
    "> 2. 根据样本集$D={x_1,x_2,\\ldots,x_n}$，计算似然函数$P(D|\\theta)$：$\\displaystyle P(D|\\theta)=\\prod_{i=1}^n P(x_n|D)$\n",
    "> 3. 利用贝叶斯公式，求$\\theta$的后验概率：$\\displaystyle P(\\theta|D)=\\frac{P(D|\\theta)P(\\theta)}{\\displaystyle \\int \\limits_\\Theta P(D|\\theta) P(\\theta) d \\theta}$ \n",
    "> 4. 计算后验概率分布参数$\\theta$的期望，并求出贝叶斯估计值：$\\displaystyle \\hat{\\theta}=\\int \\limits_{\\Theta} \\theta \\cdot P(\\theta|D) d \\theta$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**第2步：证明公式(4.11)**\n",
    "$$\n",
    "\\displaystyle P_\\lambda(Y=c_k) = \\frac{\\displaystyle \\sum_{i=1}^N I(y_i=c_k) + \\lambda}{N+K \\lambda}, \\quad k= 1,2,\\ldots,K\n",
    "$$  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "证明思路：\n",
    "1. 条件假设：$P_\\lambda(Y=c_k)=u_k$，且服从参数为$\\lambda$的Dirichlet分布；随机变量$Y$出现$y=c_k$的次数为$m_k$； \n",
    "2. 得到$u$的先验概率$P(u)$；\n",
    "3. 得到似然函数$P(m|u)$；\n",
    "4. 根据贝叶斯公式，计算后验概率$P(u|m)$\n",
    "5. 计算$u$的期望$E(u)$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "证明步骤："
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "1. 条件假设  \n",
    "\n",
    "&emsp;&emsp;根据朴素贝叶斯法的基本方法，训练数据集$T=\\{(x_1,y_1),(x_2,y_2),\\ldots,(x_N,y_N)\\}$，假设：  \n",
    "（1）随机变量$Y$出现$y=c_k$的次数为$m_k$，即$\\displaystyle m_k=\\sum_{i=1}^N I(y_i=c_k)$，可知$\\displaystyle \\sum_{k=1}^K m_k = N$（$y$总共有$N$个）；  \n",
    "（2）$P_\\lambda(Y=c_k)=u_k$，随机变量$u_k$服从参数为$\\lambda$的Dirichlet分布。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> **补充说明：**  \n",
    "> 1. 狄利克雷(Dirichlet)分布  \n",
    "&emsp;&emsp;参考PRML（Pattern Recognition and Machine Learning）一书的第2.2.1章节：⽤似然函数(2.34)乘以先验(2.38)，我们得到了参数$\\{u_k\\}$的后验分布，形式为\n",
    "> $$ p(u|D,\\alpha) \\propto p(D|u)p(u|\\alpha) \\propto \\prod_{k=1}^K u_k^{\\alpha_k+m_k-1} $$\n",
    "> &emsp;&emsp;该书中第B.4章节：\n",
    "狄利克雷分布是$K$个随机变量$0 \\leqslant u_k \\leqslant 1$的多变量分布，其中$k=1,2,\\ldots,K$，并满足以下约束\n",
    "> $$ 0 \\leqslant u_k \\leqslant 1, \\quad \\sum_{k=1}^K u_k = 1 $$\n",
    "记$u=(u_1,\\ldots,u_K)^T, \\alpha=(\\alpha_1,\\ldots,\\alpha_K)^T$，有\n",
    "> $$ \\text{Dir}(u|\\alpha) = C(\\alpha) \\prod_{k-1}^K u_k^{\\alpha_k - 1} \\\\\n",
    "E(u_k) = \\frac{\\alpha_k}{\\displaystyle \\sum_{k=1}^K \\alpha_k} $$\n",
    "> 2. 为什么假设$Y=c_k$的概率服从Dirichlet分布？  \n",
    "答：原因如下：  \n",
    "（1）首先，根据PRML第B.4章节，Dirichlet分布是Beta分布的推广。  \n",
    "（2）由于，Beta分布是二项式分布的共轭分布，Dirichlet分布是多项式分布的共轭分布。Dirichlet分布可以看作是“分布的分布”；  \n",
    "（3）又因为，Beta分布与Dirichlet分布都是先验共轭的，意味着先验概率和后验概率属于同一个分布。当假设为Beta分布或者Dirichlet分布时，通过获得大量的观测数据，进行数据分布的调整，使得计算出来的概率越来越接近真实值。  \n",
    "（4）因此，对于一个概率未知的事件，Beta分布或Dirichlet分布能作为表示该事件发生的概率的概率分布。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "2. 得到先验概率  \n",
    "&emsp;&emsp;根据假设(2)和Dirichlet分布的定义，可得先验概率为\n",
    "$$\n",
    "\\displaystyle P(u)=P(u_1,u_2,\\ldots,u_K) = C(\\lambda) \\prod_{k=1}^K u_k^{\\lambda - 1}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "3. 得到似然函数  \n",
    "&emsp;&emsp;记$m=(m_1, m_2, \\ldots, m_K)^T$，可得似然函数为\n",
    "$$\n",
    "P(m|u) = u_1^{m_1} \\cdot u_2^{m_2} \\cdots u_K^{m_K} = \\prod_{k=1}^K u_k^{m_k}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "4. 得到后验概率分布  \n",
    "&emsp;&emsp;结合贝叶斯公式，求$u$的后验概率分布，可得\n",
    "$$\n",
    "P(u|m) = \\frac{P(m|u)P(u)}{P(m)}\n",
    "$$\n",
    "&emsp;&emsp;根据假设(1)，可得\n",
    "$$\n",
    "P(u|m,\\lambda) \\propto P(m|u)P(u|\\lambda) \\propto \\prod_{k=1}^K u_k^{\\lambda+m_k-1}\n",
    "$$\n",
    "&emsp;&emsp;上式表明，后验概率分布$P(u|m,\\lambda)$也服从Dirichlet分布"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "5. 得到随机变量$u$的期望  \n",
    "&emsp;&emsp;根据后验概率分布$P(u|m,\\lambda)$和假设(1)，求随机变量$u$的期望，可得\n",
    "$$\n",
    "E(u_k) = \\frac{\\alpha_k}{\\displaystyle \\sum_{k=1}^K \\alpha_k}\n",
    "$$\n",
    "其中$\\alpha_k = \\lambda+m_k$，则\n",
    "$$\n",
    "\\begin{aligned}\n",
    "E(u_k) &= \\frac{\\alpha_k}{\\displaystyle \\sum_{k=1}^K \\alpha_k} \\\\\n",
    "&= \\frac{\\lambda+m_k}{\\displaystyle \\sum_{k=1}^K (\\lambda + m_k)} \\\\\n",
    "&= \\frac{\\lambda+m_k}{\\displaystyle \\sum_{k=1}^K \\lambda +\\sum_{k=1}^K m_k} \\quad(\\because \\sum_{k=1}^K m_k = N) \\\\\n",
    "&= \\frac{\\lambda+m_k}{\\displaystyle K \\lambda + N } \\quad (\\because m_k=\\sum_{i=1}^N I(y_i=c_k)) \\\\\n",
    "&= \\frac{\\displaystyle \\sum_{i=1}^N I(y_i=c_k) + \\lambda}{N+K \\lambda}\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
    "&emsp;&emsp;随机变量$u_k$取$u_k$的期望，可得\n",
    "$\\displaystyle P_\\lambda(Y=c_k) = \\frac{\\displaystyle \\sum_{i=1}^N I(y_i=c_k) + \\lambda}{N+K \\lambda}$，公式(4.11)得证"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**第3步：证明公式(4.10)**\n",
    "\n",
    "$$\n",
    "\\displaystyle P_{\\lambda}(X^{(j)}=a_{jl} | Y = c_k) = \\frac{\\displaystyle \\sum_{i=1}^N I(x_i^{(j)}=a_{jl},y_i=c_k) + \\lambda}{\\displaystyle \\sum_{i=1}^N I(y_i=c_k) + S_j \\lambda}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "证明思路：\n",
    "1. 条件假设：$P_{\\lambda}(X^{(j)}=a_{jl} | Y = c_k)=u_l$，其中$l=1,2,\\ldots,S_j$，且服从参数为$\\lambda$的Dirichlet分布；出现$x^{(j)}=a_{jl}, y=c_k$的次数为$m_l$； \n",
    "2. 得到$u$的先验概率$P(u)$；\n",
    "3. 得到似然函数$P(m|u)$；\n",
    "4. 根据贝叶斯公式，计算后验概率$P(u|m)$\n",
    "5. 计算$u$的期望$E(u)$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "证明步骤："
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "1. 条件假设  \n",
    "\n",
    "&emsp;&emsp;根据朴素贝叶斯法的基本方法，训练数据集$T=\\{(x_1,y_1),(x_2,y_2),\\ldots,(x_N,y_N)\\}$，假设：  \n",
    "（1）出现$x^{(j)}=a_{jl}, y=c_k$的次数为$m_l$，即$\\displaystyle m_l=\\sum_{i=1}^N I(x_i^{(j)}=a_{jl},y_i=c_k)$，可知$\\displaystyle \\sum_{l=1}^{S_j} m_l = \\sum_{i=1}^N I(y_i=c_k)$（总共有$\\displaystyle \\sum_{i=1}^N I(y_i=c_k)$个）；  \n",
    "（2）$P_{\\lambda}(X^{(j)}=a_{jl} | Y = c_k)=u_l$，随机变量$u_l$服从参数为$\\lambda$的Dirichlet分布。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "2. 得到先验概率  \n",
    "&emsp;&emsp;根据假设(2)和Dirichlet分布的定义，可得先验概率为\n",
    "$$\n",
    "\\displaystyle P(u)=P(u_1,u_2,\\ldots,u_{S_j}) = C(\\lambda) \\prod_{l=1}^{S_j} u_l^{\\lambda - 1}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "3. 得到似然函数  \n",
    "&emsp;&emsp;记$m=(m_1, m_2, \\ldots, m_{S_j})^T$，可得似然函数为\n",
    "$$\n",
    "P(m|u) = u_1^{m_1} \\cdot u_2^{m_2} \\cdots u_{S_j}^{m_{S_j}} = \\prod_{l=1}^{S_j} u_l^{m_l}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "4. 得到后验概率分布  \n",
    "&emsp;&emsp;结合贝叶斯公式，求$u$的后验概率分布，可得\n",
    "$$\n",
    "P(u|m) = \\frac{P(m|u)P(u)}{P(m)}\n",
    "$$\n",
    "&emsp;&emsp;根据假设(1)，可得\n",
    "$$\n",
    "P(u|m,\\lambda) \\propto P(m|u)P(u|\\lambda) \\propto \\prod_{l=1}^{S_j} u_l^{\\lambda+m_l-1}\n",
    "$$\n",
    "&emsp;&emsp;上式表明，后验概率分布$P(u|m,\\lambda)$也服从Dirichlet分布"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "5. 得到随机变量$u$的期望  \n",
    "&emsp;&emsp;根据后验概率分布$P(u|m,\\lambda)$和假设(1)，求随机变量$u$的期望，可得\n",
    "$$\n",
    "E(u_k) = \\frac{\\alpha_l}{\\displaystyle \\sum_{l=1}^{S_j} \\alpha_l}\n",
    "$$\n",
    "其中$\\alpha_l = \\lambda+m_l$，则\n",
    "$$\n",
    "\\begin{aligned}\n",
    "E(u_l) &= \\frac{\\alpha_l}{\\displaystyle \\sum_{l=1}^{S_j} \\alpha_l} \\\\\n",
    "&= \\frac{\\lambda+m_l}{\\displaystyle \\sum_{l=1}^{S_j} (\\lambda + m_l)} \\\\\n",
    "&= \\frac{\\lambda+m_l}{\\displaystyle \\sum_{l=1}^{S_j} \\lambda +\\sum_{l=1}^{S_j} m_l} \\quad(\\because \\sum_{l=1}^{S_j} m_l = \\sum_{i=1}^N I(y_i=c_k)) \\\\\n",
    "&= \\frac{\\lambda+m_l}{\\displaystyle S_j \\lambda + \\sum_{i=1}^N I(y_i=c_k) } \\quad (\\because m_l=\\sum_{i=1}^N I(x_i^{(j)}=a_{jl},y_i=c_k)) \\\\\n",
    "&= \\frac{\\displaystyle \\sum_{i=1}^N I(x_i^{(j)}=a_{jl},y_i=c_k) + \\lambda}{\\displaystyle \\sum_{i=1}^N I(y_i=c_k) + S_j \\lambda}\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
    "&emsp;&emsp;随机变量$u_k$取$u_k$的期望，可得\n",
    "$\\displaystyle P_{\\lambda}(X^{(j)}=a_{jl} | Y = c_k) = \\frac{\\displaystyle \\sum_{i=1}^N I(x_i^{(j)}=a_{jl},y_i=c_k) + \\lambda}{\\displaystyle \\sum_{i=1}^N I(y_i=c_k) + S_j \\lambda}$，公式(4.10)得证。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 参考文献"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "【1】极大似然估计的一般步骤（来源于Wiki百科）：https://en.wikipedia.org/wiki/Maximum_likelihood_estimation   \n",
    "【2】贝叶斯估计的一般步骤（来源于Wiki百科）：https://en.wikipedia.org/wiki/Bayes_estimator"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.5"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
